Abstract. This is a continuation of our earlier paper [27] on the universality
of the eigenvalues of Wigner random matrices. The main new results of this
paper are an extension of the results in [27] from the bulk of the spectrum
up to the edge. In particular, we prove a variant of the universality results of
Soshnikov [25] for the largest eigenvalues, assuming moment conditions rather
than symmetry conditions. The main new technical observation is that there
is a significant bias in the Cauchy interlacing law near the edge of the spectrum
which allows one to continue ensuring the delocalization of eigenvectors.



1. Introduction 

In our recent paper [27], a universality result (the Four Moment Theorem) was
established for the eigenvalue spacings in the bulk of the spectrum of random
Hermitian matrices. (See [6] for an extended discussion of the universality
phenomenon, and [27] for further references on universality results in the context
of Wigner Hermitian matrices.)

The main purpose of this paper is to extend this universality result to the edge of 
the spectrum as well. 

1.1. Universality in the bulk. To recall the Four Moment Theorem, we need 
some notation. 

Definition 1.2 (Condition C0). A random Hermitian matrix $M_n = (\zeta_{ij})_{1 \le i,j \le n}$
is said to obey condition C0 if

• The $\zeta_{ij}$ are independent (but not necessarily identically distributed) for
$1 \le i \le j \le n$. For $1 \le i < j \le n$, they have mean zero and variance 1; for
$i = j$, they have mean zero and variance $c$ for some fixed $c > 0$ independent
of $n$.

• (Uniform exponential decay) There exist constants $C, C' > 0$ such that

(1)  $P(|\zeta_{ij}| \ge t^{C}) \le \exp(-t)$

for all $t \ge C'$ and $1 \le i, j \le n$.



T. Tao is supported by a grant from the MacArthur Foundation, by NSF grant DMS-0649473,
and by the NSF Waterman award.

V. Vu is supported by research grants DMS-0901216 and AFOSAR-FA-9550-09-1-0167. 




TERENCE TAO AND VAN VU 



Examples of random Hermitian matrices obeying Condition C0 include the GUE
and GOE ensembles, or the random symmetric Bernoulli ensemble in which each of
the $\zeta_{ij}$ is equal to $\pm 1$ with equal probability 1/2. In GOE one has $c = 2$, but in the
other two cases one has $c = 1$. The arguments in the previous paper [27] were largely
phrased for the case $c = 1$, but they extend without difficulty to other values of $c$
(the main point being that a modification of the variance of a single entry of a
row vector does not significantly affect the Talagrand concentration inequality,
[27, Lemma 43], or Lemma 2.1 below).

Given an $n \times n$ Hermitian matrix $A$, we denote its $n$ eigenvalues as

$\lambda_1(A) \le \dots \le \lambda_n(A)$,

and write $\lambda(A) := (\lambda_1(A), \dots, \lambda_n(A))$. We also let $u_1(A), \dots, u_n(A) \in \mathbb{C}^n$ be an
orthonormal basis of eigenvectors of $A$ with $A u_i(A) = \lambda_i(A) u_i(A)$; these
eigenvectors $u_i(A)$ are only determined up to a complex phase even when the eigenvalues
are simple, but this ambiguity will not cause a difficulty in our results as we will
only be interested in the magnitude $|u_i(A)^* X|$ of various inner products $u_i(A)^* X$
of $u_i(A)$ with other vectors $X$.

It will be convenient to introduce the following notation for frequent events
depending on $n$, in increasing order of likelihood:

Definition 1.3 (Frequent events). Let $E$ be an event depending on $n$.

• $E$ holds asymptotically almost surely if $P(E) = 1 - o(1)$.

• $E$ holds with high probability if $P(E) \ge 1 - O(n^{-c})$ for some constant $c > 0$.

• $E$ holds with overwhelming probability if $P(E) \ge 1 - O_C(n^{-C})$ for every
constant $C > 0$ (or equivalently, that $P(E) \ge 1 - \exp(-\omega(\log n))$).

• $E$ holds almost surely if $P(E) = 1$.

Definition 1.4 (Moment matching). We say that two complex random variables
$\zeta$ and $\zeta'$ match to order $k$ if

$E\,\mathrm{Re}(\zeta)^m \mathrm{Im}(\zeta)^l = E\,\mathrm{Re}(\zeta')^m \mathrm{Im}(\zeta')^l$

for all $m, l \ge 0$ such that $m + l \le k$.
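To make Definition 1.4 concrete, the following minimal sketch checks moment matching for two finitely supported distributions. The function names and the two example atoms (a symmetric Bernoulli sign and a three-point law) are our own illustrative choices and do not appear in the paper.

```python
import math

def mixed_moment(dist, m, l):
    """E[Re(z)^m * Im(z)^l] for a finite distribution [(value, probability), ...]."""
    return sum(p * complex(z).real ** m * complex(z).imag ** l for z, p in dist)

def match_to_order(dist_a, dist_b, k, tol=1e-12):
    """True if all mixed moments with m + l <= k agree up to tol."""
    return all(
        abs(mixed_moment(dist_a, m, l) - mixed_moment(dist_b, m, l)) <= tol
        for m in range(k + 1)
        for l in range(k + 1 - m)
    )

# Hypothetical example atoms: a symmetric Bernoulli sign, and a three-point law
# with mean 0, variance 1, third moment 0 and fourth moment 3.
bernoulli = [(1.0, 0.5), (-1.0, 0.5)]
root3 = math.sqrt(3.0)
three_point = [(root3, 1.0 / 6.0), (-root3, 1.0 / 6.0), (0.0, 2.0 / 3.0)]

print(match_to_order(bernoulli, three_point, 3))  # moments up to order 3 agree
print(match_to_order(bernoulli, three_point, 4))  # fourth moments differ (1 versus 3)
```

In this example the two atoms match to order 3 but not to order 4, which is the situation covered by the second (weaker) half of the Four Moment Theorem.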

The first main result of [27] can now be stated as follows:

Theorem 1.5 (Four Moment Theorem). [27, Theorem 15] There is a small positive
constant $c_0$ such that for every $0 < \varepsilon < 1$ and $k \ge 1$ the following holds. Let
$M_n = (\zeta_{ij})_{1 \le i,j \le n}$ and $M'_n = (\zeta'_{ij})_{1 \le i,j \le n}$ be two random matrices satisfying C0.
Assume furthermore that for any $1 \le i < j \le n$, $\zeta_{ij}$ and $\zeta'_{ij}$ match to order 4, and for
any $1 \le i \le n$, $\zeta_{ii}$ and $\zeta'_{ii}$ match to order 2. Set $A_n := \sqrt{n} M_n$ and $A'_n := \sqrt{n} M'_n$,
and let $G : \mathbb{R}^k \to \mathbb{R}$ be a smooth function obeying the derivative bounds

(2)  $|\nabla^j G(x)| \le n^{c_0}$

for all $0 \le j \le 5$ and $x \in \mathbb{R}^k$. Then for any $\varepsilon n \le i_1 < i_2 < \dots < i_k \le (1 - \varepsilon) n$, and
for $n$ sufficiently large depending on $\varepsilon$, $k$ (and the constants $C, C'$ in Definition 1.2)
we have

(3)  $|E(G(\lambda_{i_1}(A_n), \dots, \lambda_{i_k}(A_n))) - E(G(\lambda_{i_1}(A'_n), \dots, \lambda_{i_k}(A'_n)))| \le n^{-c_0}$.

If $\zeta_{ij}$ and $\zeta'_{ij}$ only match to order 3 rather than 4, then there is a positive
constant $C$ independent of $c_0$ such that the conclusion (3) still holds provided that one
strengthens (2) to

$|\nabla^j G(x)| \le n^{-C j c_0}$

for all $0 \le j \le 5$ and $x \in \mathbb{R}^k$.

See Section 1.19 for our conventions on asymptotic notation.

Informally, this theorem asserts that the distribution of any bounded number of
eigenvalues in the bulk of the spectrum of a random Hermitian matrix obeying
condition C0 depends only on the first four moments of the coefficients.

There is also a useful companion result to Theorem 1.5 which is used both in the
proof of that theorem, and in several of its applications:

Theorem 1.6 (Lower tail estimates). [27, Theorem 17] Let $0 < \varepsilon < 1$ be a constant,
and let $M_n$ be a random matrix obeying Condition C0. Set $A_n := \sqrt{n} M_n$. Then for
every $c_0 > 0$, and for $n$ sufficiently large depending on $\varepsilon$, $c_0$ and the constants $C, C'$
in Definition 1.2, and for each $\varepsilon n \le i \le (1 - \varepsilon) n$, one has $\lambda_{i+1}(A_n) - \lambda_i(A_n) \ge n^{-c_0}$
with high probability. In fact, one has

$P(\lambda_{i+1}(A_n) - \lambda_i(A_n) \le n^{-c_0}) \le n^{-c_1}$

for some $c_1 > 0$ depending on $c_0$ (and independent of $\varepsilon$).

Theorem 1.5 (and to a lesser extent, Theorem 1.6) can be used to extend the range
of applicability for various results on eigenvalue statistics in the bulk for Hermitian
or symmetric matrices, for instance in extending results for special ensembles such
as GUE or GOE (or ensembles obeying some regularity or divisibility conditions)
to more general classes of matrices. See [27], [13], [10] for some examples of this
type of extension. We also remark that a level repulsion estimate which has a
similar spirit to Theorem 1.6 was established in [9, Theorem 3.5], although the
latter result establishes repulsion of eigenvalues in a fixed small interval $I$, rather
than at a fixed index $i$ of the sequence of eigenvalues, and does not seem to be
directly substitutable for Theorem 1.6 in the arguments of this paper.

The results of Theorem 1.5 and Theorem 1.6 only control eigenvalues $\lambda_i(A_n)$ in
the bulk region $\varepsilon n \le i \le (1 - \varepsilon) n$ for some fixed $\varepsilon > 0$ (independent of $n$). The
reason for this restriction was technical, and originated from the use of the following
two related results (which are variants of previous results of Erdős, Schlein, and
Yau [7], [8], [9]), whose proof relied on the assumption that one was in the bulk:

Theorem 1.7 (Concentration for ESD). [27, Theorem 56] For any $\varepsilon, \delta > 0$ and
any random Hermitian matrix $M_n = (\zeta_{ij})_{1 \le i,j \le n}$ whose upper-triangular entries
are independent with mean zero and variance 1, and such that $|\zeta_{ij}| \le K$ almost
surely for all $i, j$ and some $1 \le K \le n^{1/2 - \varepsilon}$, and any interval $I$ in $[-2 + \varepsilon, 2 - \varepsilon]$
of width $|I| \ge \frac{K^2 \log^{20} n}{n}$, the number of eigenvalues $N_I$ of $W_n := \frac{1}{\sqrt{n}} M_n$ in $I$ obeys
the concentration estimate

$|N_I - n \int_I \rho_{sc}(x)\,dx| \le \delta n |I|$

with overwhelming probability, where $\rho_{sc}$ is the semicircular distribution

(4)  $\rho_{sc}(x) := \frac{1}{2\pi} (4 - x^2)_+^{1/2}$.

In particular, $N_I = \Theta_\varepsilon(n|I|)$ with overwhelming probability.
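As a purely illustrative aside, the main term $n \int_I \rho_{sc}(x)\,dx$ can be evaluated in closed form. The sketch below assumes only the standard formula $\rho_{sc}(x) = \frac{1}{2\pi}(4 - x^2)_+^{1/2}$ for the semicircular density; the function names are our own.

```python
import math

def rho_sc(x):
    """Semicircular density (1 / (2 pi)) * sqrt((4 - x^2)_+)."""
    return math.sqrt(max(4.0 - x * x, 0.0)) / (2.0 * math.pi)

def semicircle_cdf(x):
    """Closed-form integral of rho_sc over (-infinity, x]."""
    x = min(max(x, -2.0), 2.0)  # the density vanishes outside [-2, 2]
    return 0.5 + x * math.sqrt(4.0 - x * x) / (4.0 * math.pi) + math.asin(x / 2.0) / math.pi

def expected_count(n, a, b):
    """Main term n * integral_a^b rho_sc(x) dx for the eigenvalue count N_I, I = [a, b]."""
    return n * (semicircle_cdf(b) - semicircle_cdf(a))

n = 10 ** 6  # illustrative matrix size
print(expected_count(n, -1.0, 1.0))               # a bulk interval: a positive fraction of n
print(expected_count(n, 2.0 - n ** (-2.0 / 3.0), 2.0))  # an edge interval of width n^(-2/3)
```

For the bulk interval the main term is proportional to $n|I|$, whereas an interval of width $n^{-2/3}$ at the edge carries a main term of size $O(1)$, consistent with the larger eigenvalue spacing near the edge.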

Proposition 1.8 (Delocalization of eigenvectors). [27, Proposition 58] Let $\varepsilon, M_n, W_n, \zeta_{ij}, K$
be as in Theorem 1.7. Then for any $1 \le i \le n$ with $\lambda_i(W_n) \in [-2 + \varepsilon, 2 - \varepsilon]$, if
$u_i(W_n)$ denotes a unit eigenvector corresponding to $\lambda_i(W_n)$, then with overwhelming
probability each coordinate of $u_i(M_n)$ is $O_\varepsilon\left(\frac{K^2 \log^{20} n}{n^{1/2}}\right)$.



In the bulk region $[-2 + \varepsilon, 2 - \varepsilon]$, the semicircular function $\rho_{sc}$ is bounded away
from zero. Thus, Theorem 1.7 ensures that the eigenvalues of $W_n$ in the bulk tend
to have a mean spacing of $\Theta_\varepsilon(1/n)$ on the average. Applying the Cauchy interlacing
law

(5)  $\lambda_i(W_n) \le \lambda_i(W_{n-1}) \le \lambda_{i+1}(W_n)$,

where $W_{n-1}$ is an $(n-1) \times (n-1)$ minor of $W_n$, this implies that the bulk eigenvalues
of $W_{n-1}$ are within $O_\varepsilon(1/n)$ of the corresponding eigenvalues of $W_n$ on the average.
Using linear algebra to express the coordinates of the eigenvector $u_i(M_n)$ in terms
of $W_n$ and a minor $W_{n-1}$ (see Lemma 4.1 below), and using some concentration
of measure results concerning the projection of a random vector to a subspace (see
Lemma 2.1), we eventually obtain Proposition 1.8.
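The Cauchy interlacing law (5) is easy to observe numerically. The following minimal sketch uses only the Python standard library; the eigenvalue routine is a textbook classical Jacobi iteration that we supply for self-containedness (it is not a method used in the paper), and the matrix size and seed are arbitrary.

```python
import math
import random

def symmetric_eigenvalues(matrix, tol=1e-12, max_rotations=10000):
    """Ascending eigenvalues of a real symmetric matrix (classical Jacobi rotations)."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(max_rotations):
        # Locate the largest off-diagonal entry a[p][q].
        p, q, biggest = 0, 0, 0.0
        for i in range(n):
            for j in range(i + 1, n):
                if abs(a[i][j]) > biggest:
                    p, q, biggest = i, j, abs(a[i][j])
        if biggest < tol:
            break
        # Rotation angle chosen so that the (p, q) entry becomes zero.
        theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
        c, s = math.cos(theta), math.sin(theta)
        for k in range(n):  # update columns p and q
            akp, akq = a[k][p], a[k][q]
            a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
        for k in range(n):  # update rows p and q
            apk, aqk = a[p][k], a[q][k]
            a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))

def bernoulli_wigner(n, rng):
    """W_n = M_n / sqrt(n), with M_n a random symmetric matrix of +1/-1 entries."""
    m = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            m[i][j] = m[j][i] = rng.choice((-1.0, 1.0)) / math.sqrt(n)
    return m

def interlaces(big, small, slack=1e-9):
    """Check lambda_i(W_n) <= lambda_i(W_{n-1}) <= lambda_{i+1}(W_n) for all i."""
    return all(big[i] - slack <= small[i] <= big[i + 1] + slack for i in range(len(small)))

rng = random.Random(0)  # arbitrary seed, for reproducibility only
w = bernoulli_wigner(8, rng)
minor = [row[1:] for row in w[1:]]  # delete the first row and column
print(interlaces(symmetric_eigenvalues(w), symmetric_eigenvalues(minor)))
```

Note that the minor is taken from $W_n$ itself, so it keeps the normalisation $1/\sqrt{n}$ rather than $1/\sqrt{n-1}$, as in (5).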



1.9. Universality up to the edge. The main results of this paper are that the
above four theorems can be extended to the edge of the spectrum (thus effectively
sending $\varepsilon$ to zero). Let us now state these results more precisely. Firstly, we have
the following extension of Theorem 1.7:



Theorem 1.10 (Concentration for ESD up to edge). Consider a random Hermitian
matrix $M_n = (\zeta_{ij})_{1 \le i,j \le n}$ whose upper-triangular entries are independent with mean
zero and variance 1, and such that $|\zeta_{ij}| \le K$ almost surely for all $i, j$ and some
$K \ge 1$. Let $0 < \delta < 1/2$ be a quantity which can depend on $n$, and let $I$ be an
interval such that

^^!iogS. 

We also make the mild assumption $K = o(n^{1/2} \delta^2)$. Then the number of eigenvalues
$N_I$ of $W_n := \frac{1}{\sqrt{n}} M_n$ in $I$ obeys the concentration estimate

$|N_I - n \int_I \rho_{sc}(x)\,dx| \le \delta n |I|$

with overwhelming probability.

Remark 1.11. The powers of $K$, $\delta$ and $\log n$ here are probably not best possible, but
this will not be relevant for our purposes. In our applications $K$ will be a power of
$\log n$, and $\delta$ will be a negative power of $\log n$. (This allows the error term $O(\delta n |I|)$
in the above estimate for $N_I$ to exceed the main term $n \int_I \rho_{sc}(x)\,dx$ when one is
very near the edge, but this will not impact our arguments.)

We prove this theorem in Section 3 using the same (standard) Stieltjes transform
method that was used to prove Theorem 1.7 in [27] (see also [9]), with a somewhat
more careful analysis. We next use it to obtain the following extension of
Proposition 1.8:

Proposition 1.12 (Delocalization of eigenvectors up to the edge). Let $M_n$ be a
random matrix obeying condition C0. Then with overwhelming probability, every
unit eigenvector $u_i(M_n)$ of $M_n$ has coefficients at most $n^{-1/2} \log^{O(1)} n$, thus

$\sup_{1 \le i,j \le n} |u_i(M_n)^* e_j| \le n^{-1/2} \log^{O(1)} n$

where $e_1, \dots, e_n$ is the standard basis.

The deduction of Proposition 1.12 from Theorem 1.10 differs significantly from the
analogous deduction of Proposition 1.8 from Theorem 1.7 in [27]. The main difference
is that in the current case $\rho_{sc}$ is no longer bounded away from zero, which causes the
average eigenvalue spacing between $\lambda_i(W_n)$ and $\lambda_{i+1}(W_n)$ to be considerably larger
than $1/n$. For instance, the gap between the second largest eigenvalue $\lambda_{n-1}(W_n)$
and the largest eigenvalue $\lambda_n(W_n)$ is typically of size $n^{-2/3}$.

The key new ingredient that helps us to deal with this problem is the following
observation: the Cauchy interlacing law (5), when applied to the eigenvalues at
the edge, is strongly biased. In particular, the gap between $\lambda_i(W_{n-1})$ and $\lambda_i(W_n)$ is
significantly smaller than the gap between $\lambda_i(W_{n-1})$ and $\lambda_{i+1}(W_n)$. We can show
that (with high probability) the first gap is of order $n^{-1+o(1)}$, while the second can
be as large as $n^{-2/3}$ (and similarly for the gap between $\lambda_{i+1}(W_n)$ and $\lambda_i(W_{n-1})$
when $n/2 \le i \le n$). This new ingredient will be sufficient to recover Proposition
1.12; see Section 4, where the above proposition is proved.

Using Theorem 1.10 and Proposition 1.12 one can continue the arguments from
[27] to establish the following extensions of Theorem 1.5 and Theorem 1.6:

Theorem 1.13 (Four Moment Theorem up to the edge). There is a small positive
constant $c_0$ such that for every $k \ge 1$ the following holds. Let $M_n = (\zeta_{ij})_{1 \le i,j \le n}$
and $M'_n = (\zeta'_{ij})_{1 \le i,j \le n}$ be two random matrices satisfying C0. Assume furthermore
that for any $1 \le i < j \le n$, $\zeta_{ij}$ and $\zeta'_{ij}$ match to order 4, and for any $1 \le i \le n$, $\zeta_{ii}$
and $\zeta'_{ii}$ match to order 2. Set $A_n := \sqrt{n} M_n$ and $A'_n := \sqrt{n} M'_n$, and let $G : \mathbb{R}^k \to \mathbb{R}$
be a smooth function obeying the derivative bounds (2) for all $0 \le j \le 5$ and $x \in \mathbb{R}^k$.
Then for any $1 \le i_1 < i_2 < \dots < i_k \le n$, and for $n$ sufficiently large depending on $k$
(and the constants $C, C'$ in Definition 1.2) we have (3). If $\zeta_{ij}$ and $\zeta'_{ij}$ only match
to order 3 rather than 4, then there is a positive constant $C$ independent of $c_0$ such
that the conclusion (3) still holds provided that one strengthens (2) to

(6)  $|\nabla^j G(x)| \le n^{-C j c_0}$

for all $0 \le j \le 5$ and $x \in \mathbb{R}^k$.

Theorem 1.14 (Lower tail estimates up to the edge). Let $M_n$ be a random matrix
obeying Condition C0. Set $A_n := \sqrt{n} M_n$. Then for every $c_0 > 0$, and for $n$
sufficiently large depending on $c_0$ and the constants $C, C'$ in Definition 1.2, and
for each $1 \le i < n$, one has $\lambda_{i+1}(A_n) - \lambda_i(A_n) \ge n^{-c_0}$ with high probability,
uniformly in $i$.

The novelty here is that we have no assumption on the indices $i_j$ and $i$. We present
the proofs of these theorems in later sections, following the arguments in [27] closely.

1.15. Applications. As Theorems 1.13, 1.14 extend Theorems 1.5, 1.6, all the
applications of the latter theorems in [27] (concerning the bulk of the spectrum)
can also be viewed as applications of these theorems. But because these results
extend all the way to the edge, we can now obtain some results on the edge of the
spectrum as well. For instance, we can prove

Theorem 1.16. Let $k$ be a fixed integer and $M_n$ be a matrix obeying Condition
C0, and suppose that the real and imaginary part of the atom variables have the
same covariance matrix as the GUE ensemble (i.e. both components have variance
1/2, and have covariance 0). Assume furthermore that all third moments of the
atom variables vanish. Set $W_n := \frac{1}{\sqrt{n}} M_n$. Then the joint distribution of the
$k$-dimensional random vector

(7)  $\left((\lambda_n(W_n) - 2) n^{2/3}, \dots, (\lambda_{n-k+1}(W_n) - 2) n^{2/3}\right)$

has a weak limit as $n \to \infty$, which coincides with that in the GUE case (in particular,
the limit for $k = 1$ is the GUE Tracy-Widom distribution [28], and for higher $k$
is controlled by the Airy kernel [14]). The result also holds for the smallest
eigenvalues $\lambda_1, \dots, \lambda_k$, with the offset $-2$ replaced by $+2$.

If the atom variables have the same covariance matrix as the GOE ensemble (i.e.
they are real with variance 1 off the diagonal, and 2 on the diagonal), instead of
the GUE ensemble, then the same conclusion applies but with the GUE distribution
replaced of course by the GOE distribution (see [53] for the $k = 1$ case).

This result was previously established by Soshnikov [25] (see also [23], [24]) in the
case when $M_n$ is a Wigner Hermitian matrix (i.e. the off-diagonal entries are iid,
and the matrix matches GUE to second order at least) with symmetric distribution
(which implies, but is stronger than, matching to third order). For some additional
partial results in the non-symmetric case see [20], [H]. The exponential decay
condition in Soshnikov's result has been lowered to a finite number of moments;
see [22], [IB]. It is reasonable to conjecture that the exponential decay conditions
in this current paper can similarly be lowered, but we will not pursue this issue
here. It also seems plausible that the third moment matching conditions could be
dropped, though this is barely beyond the reach of the current method.

Note added in proof: the third moment condition has recently been dropped in [16], by
combining the four moment theorem here with a new proof of universality for the distribution of
the largest eigenvalue for gauss divisible matrices.

Proof. We just prove the claim for the largest $k$ eigenvalues and for GUE, as the
claim for the smallest $k$ and/or GOE is similar.

Set $A_n := \sqrt{n} M_n$. It suffices to show that for every smooth function $G : \mathbb{R}^k \to \mathbb{R}$,
the expectation

(8)  $E\,G\left((\lambda_n(A_n) - 2n)/n^{1/3}, \dots, (\lambda_{n-k+1}(A_n) - 2n)/n^{1/3}\right)$

only changes by $o(1)$ when the matrix $M_n$ is replaced with GUE. But this follows
from the final conclusion of Theorem 1.13, thanks to the extra factor $n^{-1/3}$. □

Remark 1.17. Notice that there is some room to spare in this argument, as the 
gg^jjj Q jg fg^j. niore than is needed for ([6|). Because of this, one can obtain 
similar universality results for suitably normalised eigenvalues Ai(A„) with i < n^~^ 
or i > n — n^^^ for any e > (where the normalisation factor for Ai(A„) is now 
n^/^ min(i, n — i)^/'^, and the offset —2 is replaced by — t, where Psc{x) dx = ^). 
We omit the details. 

Remark 1.18. In analogy with [13], one should be able to drop the third moment
condition in Theorem 1.16 if one can control the distribution of the largest (or
smallest) eigenvalues from random matrices obtained from a suitable
Ornstein-Uhlenbeck process, as in [12].

1.19. Notation. We consider $n$ as an asymptotic parameter tending to infinity. We
use $X \ll Y$, $Y \gg X$, $Y = \Omega(X)$, or $X = O(Y)$ to denote the bound $X \le CY$ for all
sufficiently large $n$ and for some constant $C$. Notations such as $X \ll_k Y$, $X = O_k(Y)$
mean that the hidden constant $C$ depends on another constant $k$. $X = o(Y)$ or
$Y = \omega(X)$ means that $X/Y \to 0$ as $n \to \infty$; the rate of decay here will be allowed
to depend on other parameters. We write $X = \Theta(Y)$ for $Y \ll X \ll Y$.

We view vectors $x \in \mathbb{C}^n$ as column vectors. The Euclidean norm of a vector
$x \in \mathbb{C}^n$ is defined as $\|x\| := (x^* x)^{1/2}$.

Eigenvalues are always ordered in increasing order, thus for instance $\lambda_n(A_n)$ is the
largest eigenvalue of a Hermitian matrix $A_n$, and $\lambda_1(A_n)$ is the smallest.

1.20. Acknowledgements. The authors thank the anonymous referee for helpful 
comments and references, and Horng-Tzer Yau for additional references. 



2. General tools 

In this section we record some general tools (proven in [27]) which we will use
repeatedly in the sequel. We begin with a very useful concentration of measure
result that describes the projection of a random vector to a subspace.

Lemma 2.1 (Projection Lemma). Let $X = (\xi_1, \dots, \xi_n) \in \mathbb{C}^n$ be a random vector
whose entries are independent with mean zero, variance 1, and are bounded in
magnitude by $K$ almost surely for some $K$, where $K \ge 10(E|\xi|^4 + 1)$. Let $H$ be a
subspace of dimension $d$ and $\pi_H$ the orthogonal projection onto $H$. Then

$P\left(\left|\|\pi_H(X)\| - \sqrt{d}\right| > t\right) \le 10 \exp\left(-\frac{t^2}{10 K^2}\right)$.

In particular, one has

$\|\pi_H(X)\| = \sqrt{d} + O(K \log n)$

with overwhelming probability.

The same conclusion holds (with 10 replaced by another explicit constant) if one
of the entries of $X$ is assumed to have variance $c$ instead of 1, for some absolute
constant $c > 0$.



Proof. See [27, Lemma 40]. (The main tool in the proof is Talagrand's concentration
inequality.) It is clear from the triangle inequality that the modification of variance
in a single entry does not significantly affect the conclusion except for constants. □
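The concentration of $\|\pi_H(X)\|$ around $\sqrt{d}$ can be observed in a small experiment. The sketch below is illustrative only: the dimensions, the seed, the choice of a random subspace spanned by Gaussian vectors, and the function names are all our own assumptions, and a single sample of course proves nothing about the tail bound.

```python
import math
import random

def orthonormal_basis(vectors):
    """Gram-Schmidt: orthonormal basis of the span of linearly independent real vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            dot = sum(wi * qi for wi, qi in zip(w, q))
            w = [wi - dot * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        basis.append([wi / norm for wi in w])
    return basis

def projection_norm(basis, x):
    """Norm of the orthogonal projection of x onto the span of an orthonormal basis."""
    return math.sqrt(sum(sum(qi * xi for qi, xi in zip(q, x)) ** 2 for q in basis))

rng = random.Random(1)  # arbitrary seed
n, d = 200, 50          # illustrative sizes, not taken from the paper
H = orthonormal_basis([[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(d)])
# Random signs: independent entries with mean 0, variance 1, bounded by 1.
X = [rng.choice((-1.0, 1.0)) for _ in range(n)]
print(projection_norm(H, X), math.sqrt(d))
```

The two printed numbers should be close, with a discrepancy that stays bounded as $n$ and $d$ grow, in line with the $O(K \log n)$ error term of the lemma.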



Next, we record a crude but useful upper bound on the number of eigenvalues in 
a short interval. 

Lemma 2.2 (Upper bound on ESD). Consider a random Hermitian matrix $M_n =
(\zeta_{ij})_{1 \le i,j \le n}$ whose upper-triangular entries are independent with mean zero and
variance 1 (with variance $c$ on the diagonal for some absolute constant $c > 0$), and
such that $|\zeta_{ij}| \le K$ almost surely for all $i, j$ and some $K \ge 1$. Set $W_n := \frac{1}{\sqrt{n}} M_n$.
Then for any interval $I \subset \mathbb{R}$ with $|I| \ge \frac{K^2 \log^2 n}{n}$,

$N_I \ll n |I|$

with overwhelming probability, where $N_I$ is the number of eigenvalues of $W_n$ in $I$.



Proof. See [27, Proposition 62]. (The main tools in the proof are the Stieltjes
transform method, Lemma 3.3 below, and Lemma 2.1.) Again, the generalisation
to variances other than 1 on the diagonal does not cause significant changes to the
argument. □



Finally, we recall a Berry-Esseen type theorem: 

Theorem 2.3 (Tail bounds for complex random walks). Let $1 \le N \le n$ be integers,
and let $A = (a_{i,j})_{1 \le i \le N; 1 \le j \le n}$ be an $N \times n$ complex matrix whose $N$ rows are
orthonormal in $\mathbb{C}^n$, and obeying the incompressibility condition

(9)  $\sup_{1 \le i \le N; 1 \le j \le n} |a_{i,j}| \le \sigma$

for some $\sigma > 0$. Let $\zeta_1, \dots, \zeta_n$ be independent complex random variables with mean
zero, variance $E|\zeta_j|^2$ equal to 1, and obeying $E|\zeta_i|^3 \le C$ for some $C \ge 1$. For each
$1 \le i \le N$, let $S_i$ be the complex random variable

$S_i := \sum_{j=1}^n a_{i,j} \zeta_j$

and let $S$ be the $\mathbb{C}^N$-valued random variable with coefficients $S_1, \dots, S_N$.

• (Upper tail bound on $S_i$) For $t \ge 1$, we have $P(|S_i| \ge t) \ll \exp(-ct^2) + C\sigma$
for some absolute constant $c > 0$.

• (Lower tail bound on $S$) For any $t \le \sqrt{N}$, one has $P(|S| \le t) \ll O(t/\sqrt{N})^{\lfloor N/4 \rfloor} +
C N^4 t^{-3} \sigma$.

The same claim holds if one of the $\zeta_i$ is assumed to have variance $c$ instead of 1
for some absolute constant $c > 0$.

Proof. See [27, Theorem 41]. Again, the modification of the variance on a single
entry can be easily seen to have no substantial effect on the conclusion. □



3. ASYMPTOTICS FOR THE ESD 



In this section we prove Theorem 1.10 using the Stieltjes transform method (see
[2] for a general discussion of this method). We may assume throughout that $n$ is
large, since the claim is vacuous for $n$ small.

It is known by the moment method (see e.g. [2] or [1]) that with overwhelming
probability, all eigenvalues of $W_n$ have magnitude at most $2 + o(1)$. Because of this,
we may restrict attention to the case when $I$ lies in the interval $[-3, 3]$ (say).

We recall the Stieltjes transform $s_n(z)$ of a Hermitian matrix $W_n$, defined for
complex $z$ by the formula

(10)  $s_n(z) := \frac{1}{n} \sum_{i=1}^n \frac{1}{\lambda_i(W_n) - z}$.

We also introduce the semicircular counterpart

$s(z) := \int_{-2}^{2} \frac{\rho_{sc}(x)}{x - z}\,dx$

which by a standard contour integral computation can be given explicitly as

(11)  $s(z) = \frac{1}{2}\left(-z + \sqrt{z^2 - 4}\right)$

where we use the branch of the square root of $z^2 - 4$ with cut at $[-2, 2]$ which is
asymptotic to $z$ at infinity.

It is well known that one can control the empirical spectral distribution $N_I$ via
the Stieltjes transform. We will use the following formalization of this principle:

Lemma 3.1 (Control of Stieltjes transform implies control on ESD). There is a
positive constant $C$ such that the following holds for any Hermitian matrix $W_n$. Let
$1/10 \ge \eta \ge 1/n$ and $L, \varepsilon, \delta > 0$. Suppose that one has the bound

(12)  $|s_n(z) - s(z)| \le \delta$

with (uniformly) overwhelming probability for all $z$ with $|\mathrm{Re}(z)| \le L$ and $\mathrm{Im}(z) \ge \eta$.
Then for any interval $I$ in $[-L + \varepsilon, L - \varepsilon]$ with $|I| \ge \max(2\eta, \frac{\eta}{\delta} \log \frac{1}{\delta})$, one has

$|N_I - n \int_I \rho_{sc}(x)\,dx| \ll_\varepsilon \delta n |I|$

with overwhelming probability, where $N_I$ is the number of eigenvalues of $W_n$ in $I$.

Proof. See [27, Lemma 60]. □



As a consequence of this lemma (with $L = 4$ and $\varepsilon = 1$, say), we see that Theorem
1.10 follows from

Theorem 3.2 (Concentration for the Stieltjes transform up to edge). Consider
a random Hermitian matrix $M_n = (\zeta_{ij})_{1 \le i,j \le n}$ whose upper-triangular entries are
independent with mean zero and variance 1, with variance $c$ on the diagonal for
some absolute constant $c > 0$, and such that $|\zeta_{ij}| \le K$ almost surely for all $i, j$
and some $K \ge 1$. Set $W_n := \frac{1}{\sqrt{n}} M_n$. Let $0 < \delta < 1/2$ (which can depend on $n$),
and suppose that $K = o(n^{1/2} \delta^2)$. Then (12) holds with (uniformly) overwhelming
probability for all $z$ with $|\mathrm{Re}(z)| \le 4$ and

Im(z) > -s- . 



The remainder of this section is devoted to proving Theorem 3.2. Fix $z$ as in
Theorem 3.2, thus $|\mathrm{Re}(z)| \le 4$ and $\mathrm{Im}(z) = \eta$, where

(13) r]n> ^ . 

Our objective is to show (12) with (uniformly) overwhelming probability.

As in previous works (in particular, [5], [27]), the key is to exploit the fact that
when $\mathrm{Im}\,z > 0$, $s(z)$ is the unique solution to the equation

(14)  $s(z) + \frac{1}{s(z) + z} = 0$

with $\mathrm{Im}\,s(z) > 0$; this is immediate from (11).
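As a sanity check on the explicit formula $s(z) = \frac{1}{2}(-z + \sqrt{z^2 - 4})$ and on the self-consistent equation $s(z) + \frac{1}{s(z) + z} = 0$, one can evaluate both numerically at a sample point. This is only an illustration: the test point, the quadrature resolution and the function names are our own choices, and the branch of the square root is realised here as $\sqrt{z - 2}\,\sqrt{z + 2}$ with principal square roots, which has its cut on $[-2, 2]$ and is asymptotic to $z$ at infinity.

```python
import cmath
import math

def semicircle_stieltjes(z):
    """s(z) = (-z + sqrt(z^2 - 4)) / 2, with the square-root branch asymptotic to z."""
    root = cmath.sqrt(z - 2.0) * cmath.sqrt(z + 2.0)  # branch cut on [-2, 2]
    return 0.5 * (-z + root)

def stieltjes_by_quadrature(z, steps=20000):
    """Midpoint-rule approximation of the integral of rho_sc(x) / (x - z) over [-2, 2]."""
    h = 4.0 / steps
    total = 0.0 + 0.0j
    for i in range(steps):
        x = -2.0 + (i + 0.5) * h
        total += math.sqrt(4.0 - x * x) / (2.0 * math.pi) / (x - z) * h
    return total

z = 0.5 + 1.0j  # arbitrary test point in the upper half-plane
s = semicircle_stieltjes(z)
print(abs(s + 1.0 / (s + z)))               # residual of the self-consistent equation
print(s.imag > 0)                           # s(z) lies in the upper half-plane
print(abs(s - stieltjes_by_quadrature(z)))  # agreement with the integral definition
```

The first and third printed quantities should be negligible, the former up to rounding and the latter up to quadrature error.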

We now seek a similar relation for $s_n$. Note that $\mathrm{Im}\,s_n(z) > 0$ by (10). We use the
following standard matrix identity (cf. [27, Lemma 39], or [5, Chapter 11]):

Lemma 3.3. We have

(15)  $s_n(z) = \frac{1}{n} \sum_{k=1}^n \frac{1}{\frac{1}{\sqrt{n}} \zeta_{kk} - z - Y_k}$

where

$Y_k := a_k^* (W_{n,k} - zI)^{-1} a_k$,

$W_{n,k}$ is the matrix $W_n$ with the $k$-th row and column removed, and $a_k$ is the $k$-th row
of $W_n$ with the $k$-th element removed.

Proof. By Schur's complement, $\frac{1}{\frac{1}{\sqrt{n}} \zeta_{kk} - z - Y_k}$ is the $k$-th diagonal entry of
$(W_n - zI)^{-1}$. Taking traces, one obtains the claim. □
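The Schur complement step can be checked by hand on a small example. The sketch below is an illustration under our own assumptions: it takes an arbitrary real symmetric $3 \times 3$ matrix `w` (so that $a_k^*$ is just the transpose of $a_k$, and `w[k][k]` plays the role of $\frac{1}{\sqrt{n}}\zeta_{kk}$), computes the $k$-th diagonal entry of $(W - zI)^{-1}$ by Cramer's rule, and compares it with $1/(w_{kk} - z - Y_k)$.

```python
def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def shifted(m, z):
    """Return m - z * I."""
    size = len(m)
    return [[m[i][j] - (z if i == j else 0.0) for j in range(size)] for i in range(size)]

def resolvent_diagonal(w, z, k):
    """k-th diagonal entry of (W - zI)^{-1} for a 3 x 3 matrix, by Cramer's rule."""
    keep = [i for i in range(3) if i != k]
    minor = [[w[i][j] for j in keep] for i in keep]
    return det2(shifted(minor, z)) / det3(shifted(w, z))

def schur_form(w, z, k):
    """1 / (w_kk - z - Y_k), with Y_k = a_k^T (W_k - zI)^{-1} a_k for real symmetric W."""
    keep = [i for i in range(3) if i != k]
    m = shifted([[w[i][j] for j in keep] for i in keep], z)
    a = [w[k][j] for j in keep]
    det = det2(m)
    inv = [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]
    y_k = sum(a[i] * inv[i][j] * a[j] for i in range(2) for j in range(2))
    return 1.0 / (w[k][k] - z - y_k)

w = [[1.0, 2.0, 0.0], [2.0, -1.0, 1.0], [0.0, 1.0, 0.5]]  # arbitrary symmetric example
z = 0.3 + 0.7j                                            # arbitrary point with Im z > 0
for k in range(3):
    print(abs(resolvent_diagonal(w, z, k) - schur_form(w, z, k)))
```

Averaging `schur_form(w, z, k)` over $k$ gives the normalised trace of the resolvent, which is the content of the identity in Lemma 3.3.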

Proposition 3.4 (Concentration of $Y_k$). For each $1 \le k \le n$, one has $Y_k =
s_n(z) + o(\delta^2)$ with overwhelming probability.

Proof. Fix $k$, and write $z = x + \sqrt{-1}\eta$.

The entries of $a_k$ are independent of each other and of $W_{n,k}$, and have mean zero
and variance $\frac{1}{n}$. By linearity of expectation we thus have, on conditioning on $W_{n,k}$,

$E(Y_k | W_{n,k}) = \frac{1}{n} \mathrm{trace}(W_{n,k} - zI)^{-1} = \left(1 - \frac{1}{n}\right) s_{n,k}(z)$

where

$s_{n,k}(z) := \frac{1}{n-1} \sum_{i=1}^{n-1} \frac{1}{\lambda_i(W_{n,k}) - z}$

is the Stieltjes transform of $W_{n,k}$. From the Cauchy interlacing law (5) and (13),
we have

$s_n(z) - \left(1 - \frac{1}{n}\right) s_{n,k}(z) = O\left(\frac{1}{n} \int_{\mathbb{R}} \frac{1}{|y - z|^2}\,dy\right) = O\left(\frac{1}{n\eta}\right) = o(\delta^2)$.

It follows that

$E(Y_k | W_{n,k}) = s_n(z) + o(\delta^2)$

and so it will remain to show the concentration estimate

$Y_k - E(Y_k | W_{n,k}) = o(\delta^2)$

with overwhelming probability.

Rewriting $Y_k$, it suffices to show that

(16)  $\sum_{j=1}^{n-1} \frac{R_j}{\lambda_j(W_{n,k}) - (x + \sqrt{-1}\eta)} = o(\delta^2)$

with overwhelming probability, where $R_j := |u_j(W_{n,k})^* a_k|^2 - 1/n$.

Let $1 \le i_- < i_+ \le n$, then

$\sum_{i_- \le j \le i_+} R_j = \|P_H a_k\|^2 - \frac{\dim(H)}{n}$

where $H$ is the space spanned by the $u_j(W_{n,k})$ for $i_- \le j \le i_+$. From Lemma 2.1
and the union bound, we conclude that with overwhelming probability

(17)  $\left|\sum_{i_- \le j \le i_+} R_j\right| \ll \frac{\sqrt{i_+ - i_-}\, K \log n + K^2 \log^2 n}{n}$.

By the triangle inequality, this implies that

$\|P_H a_k\|^2 \ll \frac{i_+ - i_-}{n} + \frac{\sqrt{i_+ - i_-}\, K \log n + K^2 \log^2 n}{n}$

and hence by a further application of the triangle inequality

(18)  $\sum_{i_- \le j \le i_+} |R_j| \ll \frac{(i_+ - i_-) + K^2 \log^2 n}{n}$

with overwhelming probability.

The plan is to use (17) and (18) to establish (16). Accordingly, we split the LHS of
(16) into several subsums according to the distance $|\lambda_j - x|$. Lemma 2.2 provides
a sharp estimate on the number of terms of each subsum which will allow us to
obtain a good upper bound on the absolute value.

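We turn to the details. From (13) we can choose two auxiliary parameters $0 <
\delta', \alpha < 1$ such that

(19)
$\delta' = o(\delta^2)$;
$\alpha \log n = o(\delta^2)$;
$\alpha \delta' \eta n \ge K^2 \log^2 n$;
$\frac{K \log n}{\sqrt{\alpha \delta' \eta n}} = o(\delta^2)$.

Indeed, one could set $\delta' := \delta^2 \log^{-0.01} n$ and $\alpha := \delta^2 \log^{-1.01} n$ and use (13).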

Fix such parameters, and consider the contribution to (16) of the indices $j$ for
which

$|\lambda_j(W_{n,k}) - x| \le \delta' \eta$.

By Lemma 2.2 and (19), the interval of $j$ for which this occurs has cardinality
$O(\delta' \eta n)$ (with overwhelming probability). On this interval, the quantity
$\frac{1}{\lambda_j(W_{n,k}) - (x + \sqrt{-1}\eta)}$ has magnitude $O(\frac{1}{\eta})$. Applying (18) (and (19)), we see that
the contribution of this case is thus

$\ll \frac{1}{\eta} \cdot \frac{\delta' \eta n}{n} = o(\delta^2)$

which is acceptable.

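Next, we consider the contribution to (16) of those indices $j$ for which

$(1 + \alpha)^l \delta' \eta < |\lambda_j(W_{n,k}) - x| \le (1 + \alpha)^{l+1} \delta' \eta$

for some integer $0 \le l \ll \log n / \alpha$, and then sum over $l$. By Lemma 2.2 and (19),
the set of $j$ for which this occurs is contained (with overwhelming probability)
in at most two intervals of cardinality $O((1 + \alpha)^l \alpha \delta' \eta n)$. On each of these
intervals, the quantity $\frac{1}{\lambda_j(W_{n,k}) - x - \sqrt{-1}\eta}$ has magnitude $O\left(\frac{1}{(1 + \alpha)^l \delta' \eta}\right)$ and fluctuates
by at most $O\left(\frac{\alpha}{(1 + \alpha)^l \delta' \eta}\right)$. Applying (17), (18) (and noting that $(1 + \alpha)^l \alpha \delta' \eta n$ exceeds
$K^2 \log^2 n$, by (19)) we see that the contribution of a single $l$ to (16) is at most

$\ll \frac{1}{(1 + \alpha)^l \delta' \eta} \cdot \frac{\sqrt{\alpha (1 + \alpha)^l \delta' \eta n}\, K \log n}{n} + \frac{\alpha}{(1 + \alpha)^l \delta' \eta} \cdot \frac{\alpha (1 + \alpha)^l \delta' \eta n}{n}$

which simplifies to

$\ll \alpha (1 + \alpha)^{-l/2} \frac{K \log n}{\sqrt{\alpha \delta' \eta n}} + \alpha^2$.

Summing over $l$ we obtain a bound of

$\ll \frac{K \log n}{\sqrt{\alpha \delta' \eta n}} + \alpha \log n$

which is acceptable by (19). □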



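We now conclude the proof of Theorem 1.10. By hypothesis,

$\frac{1}{\sqrt{n}} \zeta_{kk} = O(K / \sqrt{n}) = o(\delta^2)$

almost surely. Inserting these bounds into (15), we see that with overwhelming
probability

$s_n(z) + \frac{1}{n} \sum_{k=1}^n \frac{1}{s_n(z) + z + o(\delta^2)} = 0$.

By the triangle inequality (and square rooting the $o()$ decay), we can assume that
either the error term $o(\delta^2)$ is $o(\delta^2 |s_n(z) + z|)$, or that $|s_n(z) + z|$ is $o(1)$. Suppose
the former holds. Then by Taylor expansion

$\frac{1}{s_n(z) + z + o(\delta^2)} = \frac{1}{s_n(z) + z} + o(\delta^2)$

and thus

$s_n(z) + \frac{1}{s_n(z) + z} = o(\delta^2)$.

If we assume $|z| \le 10$ (say), we conclude that $|s_n(z)| \le 100$. Multiplying out by
$s_n(z) + z$ and rearranging, we obtain

$\left(s_n(z) + \frac{z}{2}\right)^2 = \frac{z^2 - 4}{4} + o(\delta^2)$.

Thus

$s_n(z) + \frac{z}{2} = \pm \frac{\sqrt{z^2 - 4}}{2} + o(\delta)$

(treating the case when $z^2 - 4 = o(\delta^2)$ separately).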

To summarise, we have shown (with overwhelming probability) that in the region 
|z|<10; |Re(z)|<4; Im(z) > 

that one either has $s_n(z) = s(z) + o(\delta)$, $s_n(z) = -z - s(z) + o(1) = s(z) - \sqrt{z^2 - 4} +
o(1)$, or $|s_n(z) + z| = o(1)$. It is not hard to see that the first two cases are
disconnected from the third (for $n$ large enough) in this region, because $s(z)$
is bounded away from zero, as is $s(z) + z = -1/s(z)$. Furthermore, the first and
second possibilities are also disconnected from each other except when $z^2 - 4 =
o(\delta^2)$. Also, the second and third possibilities can only hold for $\mathrm{Im}(z) = o(1)$ since
$s_n(z)$ and $z$ both have positive imaginary part. A continuity argument thus shows that
the first possibility must hold throughout the region except when $z^2 - 4 = o(\delta^2)$,
in which case either the first or second possibility can hold; but in that region, the
first and second possibilities are equivalent, and (12) follows. The proof of Theorem
1.10 is now complete.



4. Delocalization of eigenvectors 

Without loss of generality, we can assume that the entries are continuously
distributed. Having established Theorem 1.10, we now use this theorem to establish
Proposition 1.12.

Let $M_n$ obey condition C0. Then by Markov's inequality, one has $|\zeta_{ij}| \le \log^{O(1)} n$
with overwhelming probability (here and in the sequel we allow implied constants
in the $O()$ notation to depend on the constants $C, C'$ in (1)). By conditioning the
$\zeta_{ij}$ to this event, we may thus assume that

(20)  $|\zeta_{ij}| \le K$

almost surely for some $K = O(\log^{O(1)} n)$.

Fix $1 \le i \le n$; by symmetry we may take $i \ge n/2$. By the union bound and
another application of symmetry, it suffices to show that

$|u_i(M_n)^* e_1| \ll n^{-1/2} \log^{O(1)} n$

with overwhelming probability.

To compute $u_i(M_n)^* e_1$ we use the following identity from [7] (see also [27, Lemma
38]):

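Lemma 4.1. Let

$A_n = \begin{pmatrix} a & X^* \\ X & A_{n-1} \end{pmatrix}$

be an $n \times n$ Hermitian matrix for some $a \in \mathbb{R}$ and $X \in \mathbb{C}^{n-1}$, and let $\begin{pmatrix} x \\ v \end{pmatrix}$ be a unit
eigenvector of $A$ with eigenvalue $\lambda_i(A)$, where $x \in \mathbb{C}$ and $v \in \mathbb{C}^{n-1}$. Suppose that
none of the eigenvalues of $A_{n-1}$ are equal to $\lambda_i(A)$. Then

$|x|^2 = \frac{1}{1 + \sum_{j=1}^{n-1} (\lambda_j(A_{n-1}) - \lambda_i(A_n))^{-2} |u_j(A_{n-1})^* X|^2}$

where $u_j(A_{n-1})$ is a unit eigenvector corresponding to the eigenvalue $\lambda_j(A_{n-1})$.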

Proof. By subtracting $\lambda_i(A) I$ from $A$ we may assume $\lambda_i(A) = 0$. The eigenvector
equation then gives

$x X + A_{n-1} v = 0$,

thus

$v = -x A_{n-1}^{-1} X$.

Since $\|v\|^2 + |x|^2 = 1$, we conclude

$|x|^2 (1 + \|A_{n-1}^{-1} X\|^2) = 1$.

Since $\|A_{n-1}^{-1} X\|^2 = \sum_{j=1}^{n-1} (\lambda_j(A_{n-1}))^{-2} |u_j(A_{n-1})^* X|^2$, the claim follows. □

Strictly speaking, conditioning the $\zeta_{ij}$ to the event $|\zeta_{ij}| \le K$ distorts the mean and variance of $\zeta_{ij}$ by an exponentially small amount,
but one can easily check that this does not significantly impact any of the arguments in this section.
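The identity of Lemma 4.1 can be verified by hand in the smallest case $n = 2$. In the sketch below (our own illustration, with hypothetical function names) the matrix is real symmetric, $A = \begin{pmatrix} a & b \\ b & d \end{pmatrix}$ with $b \ne 0$, so that $A_{n-1} = (d)$, $X = b$ and $u_1(A_{n-1}) = 1$; the squared first coordinate of the unit eigenvector is computed once directly and once through the lemma.

```python
import math

def eigenvalues_2x2(a, b, d):
    """Eigenvalues (ascending) of the real symmetric matrix [[a, b], [b, d]]."""
    mean, radius = 0.5 * (a + d), math.hypot(0.5 * (a - d), b)
    return (mean - radius, mean + radius)

def first_coordinate_sq_direct(a, b, d, i):
    """|x|^2 for the unit eigenvector (x, v) belonging to the i-th eigenvalue (b != 0)."""
    lam = eigenvalues_2x2(a, b, d)[i]
    # The vector (b, lam - a) solves the first row of the eigenvector equation.
    return b * b / (b * b + (lam - a) ** 2)

def first_coordinate_sq_lemma(a, b, d, i):
    """Same quantity via Lemma 4.1 with A_{n-1} = [d], X = b, u_1(A_{n-1}) = 1."""
    lam = eigenvalues_2x2(a, b, d)[i]
    return 1.0 / (1.0 + b * b / (d - lam) ** 2)

a, b, d = 1.0, 2.0, -1.0  # arbitrary example with b != 0
for i in (0, 1):
    print(first_coordinate_sq_direct(a, b, d, i), first_coordinate_sq_lemma(a, b, d, i))
```

The two columns agree, and the formula makes visible the mechanism used below: the coordinate $x$ is small precisely when some eigenvalue of the minor lies close to $\lambda_i(A)$ with a non-negligible weight $|u_j(A_{n-1})^* X|^2$.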



Let $M_{n-1}$ be the bottom right $(n-1) \times (n-1)$ minor of $M_n$. As we are assuming
that the coefficients of $M_n$ are continuously distributed, we see almost surely that
none of the eigenvalues of $M_{n-1}$ are equal to $\lambda_i(M_n)$. We may thus apply Lemma
4.1 and conclude that

$|u_i(M_n)^* e_1|^2 = \frac{1}{1 + \sum_{j=1}^{n-1} \frac{|u_j(M_{n-1})^* X|^2}{(\lambda_j(M_{n-1}) - \lambda_i(M_n))^2}}$

where $X$ is the bottom left $(n-1) \times 1$ vector of $M_n$ (and thus has entries $\zeta_{j1}$ for
$1 < j \le n$). It thus suffices to show that

$\sum_{j=1}^{n-1} \frac{|u_j(M_{n-1})^* X|^2}{(\lambda_j(M_{n-1}) - \lambda_i(M_n))^2} \gg n \log^{-O(1)} n$

with overwhelming probability.

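It will be convenient to eliminate the exponent 2 in the denominator, as follows.
From Lemma 2.1 one has $|u_j(M_{n-1})^* X| \ll \log^{O(1)} n$ with overwhelming probability
for each $j$ (and hence for all $j$, by the union bound). It thus suffices to show
that

$\sum_{j=1}^{n-1} \frac{|u_j(M_{n-1})^* X|^4}{(\lambda_j(M_{n-1}) - \lambda_i(M_n))^2} \gg n \log^{-O(1)} n$

with overwhelming probability. By the Cauchy-Schwarz inequality, it thus suffices
to show that

$\sum_{i - T_- \le j \le i + T_+} \frac{|u_j(M_{n-1})^* X|^2}{|\lambda_j(M_{n-1}) - \lambda_i(M_n)|} \gg n^{1/2} \log^{-O(1)} n$

with overwhelming probability for some $1 \le T_-, T_+ \ll \log^{O(1)} n$. It is convenient
to work with the normalized matrix $W_n := \frac{1}{\sqrt{n}} M_n$, thus we need to show

(21)  $\sum_{i - T_- \le j \le i + T_+} \frac{|u_j(W_{n-1})^* Y|^2}{|\lambda_j(W_{n-1}) - \lambda_i(W_n)|} \gg \log^{-O(1)} n$

with overwhelming probability for some $1 \le T_-, T_+ \ll \log^{O(1)} n$, where $Y := \frac{1}{\sqrt{n}} X$
has entries $\frac{1}{\sqrt{n}} \zeta_{j1}$ for $1 < j \le n$.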

There are two cases: the bulk case and the edge case; the former was already
treated in [27], but the latter is new.

4.2. The bulk case. Suppose that $n/2 \le i < 0.999n$. Then from the semicircular
law (or Theorem 1.10) we see that $\lambda_i(W_n) \in [-2 + \varepsilon, 2 - \varepsilon]$ with overwhelming
probability for some absolute constant $\varepsilon > 0$. Let $A$ be a large constant to be chosen
later. A further application of Theorem 1.10 then shows that there is an interval
$I$ of length $\log^A n/n$ centered at $\lambda_i(W_n)$ which contains $\Theta(\log^A n)$ eigenvalues of
$W_n$. If $\lambda_j(W_n), \lambda_{j+1}(W_n)$ lie in $I$, then by the Cauchy interlacing property (5),
$|\lambda_j(W_{n-1}) - \lambda_i(W_n)| \ll \log^A n/n$. One can thus lower bound the left-hand side of
(21) (for suitable values of $T_-, T_+$) by

$$\gg n \log^{-A} n \sum_{j: \lambda_j(W_n), \lambda_{j+1}(W_n) \in I} |u_j(W_{n-1})^* Y|^2.$$

One can rewrite this as $\log^{-A} n \, \|\pi_H X\|^2$, where $H$ is the span of the $u_j(W_{n-1})$
for $\lambda_j(W_n), \lambda_{j+1}(W_n) \in I$. The claim then follows from Lemma 2.1 (for $A$ large
enough).



4.3. The edge case. We now turn to the more interesting edge case when $0.999n \le i \le n$.
Using the semicircular law, we now see that

(22) $\lambda_i(W_n) \ge 1.9$

(say) with overwhelming probability.

Next, we can exploit the following identity:

Lemma 4.4 (Interlacing identity). [27, Lemma 37] If $u_j(W_{n-1})^* Y$ is non-zero for
every $j$, then

(23) $\sum_{j=1}^{n-1} \frac{|u_j(W_{n-1})^* Y|^2}{\lambda_j(W_{n-1}) - \lambda_i(W_n)} = \frac{1}{\sqrt{n}} \zeta_{nn} - \lambda_i(W_n).$



Proof. By diagonalising $W_{n-1}$ (noting that this does not affect either side of (23)),
we may assume that $W_{n-1} = \operatorname{diag}(\lambda_1(W_{n-1}), \dots, \lambda_{n-1}(W_{n-1}))$ and $u_j(W_{n-1}) = e_j$
for $j = 1, \dots, n-1$. One then easily verifies that the characteristic polynomial
$\det(W_n - \lambda I)$ of $W_n$ is equal to

$$\prod_{j=1}^{n-1} (\lambda_j(W_{n-1}) - \lambda) \left[ \left( \frac{1}{\sqrt{n}} \zeta_{nn} - \lambda \right) - \sum_{j=1}^{n-1} \frac{|u_j(W_{n-1})^* Y|^2}{\lambda_j(W_{n-1}) - \lambda} \right]$$

when $\lambda$ is distinct from $\lambda_1(W_{n-1}), \dots, \lambda_{n-1}(W_{n-1})$. Since $u_j(W_{n-1})^* Y$ is non-zero
by hypothesis, we see that this polynomial does not vanish at any of the $\lambda_j(W_{n-1})$.
Substituting $\lambda_i(W_n)$ for $\lambda$, we obtain (23). □
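The interlacing identity is also easy to test numerically. The sketch below assumes NumPy; the function names are hypothetical, and the test matrix is an arbitrary Hermitian matrix whose bottom-right entry plays the role of $\frac{1}{\sqrt{n}} \zeta_{nn}$ and whose last column (with the bottom entry removed) plays the role of $Y$.

```python
import numpy as np

def random_hermitian(n, seed):
    """Hypothetical helper: a random n x n complex Hermitian test matrix."""
    rng = np.random.default_rng(seed)
    G = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (G + G.conj().T) / 2

def interlacing_identity_sides(W, i):
    """Return both sides of the interlacing identity for the i-th eigenvalue of W.

    W is read in block form [[W_minor, Y], [Y^*, d]]; i is a 0-based index
    into the eigenvalues of W in increasing order.
    """
    minor_vals, minor_vecs = np.linalg.eigh(W[:-1, :-1])
    Y = W[:-1, -1]                          # last column without its bottom entry
    d = W[-1, -1].real                      # bottom-right diagonal entry
    lam_i = np.linalg.eigvalsh(W)[i]
    weights = np.abs(minor_vecs.conj().T @ Y) ** 2    # |u_j(W_minor)^* Y|^2
    lhs = np.sum(weights / (minor_vals - lam_i))
    return lhs, d - lam_i

if __name__ == "__main__":
    W = random_hermitian(7, seed=1)
    for i in range(7):
        print(i, *interlacing_identity_sides(W, i))   # both sides should agree
```

At the largest eigenvalue every denominator on the left-hand side is negative by Cauchy interlacing, which is the sign structure exploited at the edge.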



Again, the continuity of the entries of $M_n$ ensures that the hypothesis of Lemma
4.4 is obeyed almost surely. From (20), (22), (23) one has

$$\left| \sum_{j=1}^{n-1} \frac{|u_j(W_{n-1})^* Y|^2}{\lambda_j(W_{n-1}) - \lambda_i(W_n)} \right| \ge 1.9 - o(1)$$

with overwhelming probability, so to show (21), it will suffice by the triangle
inequality to show that

$$\left| \sum_{j > i + T_+ \text{ or } j < i - T_-} \frac{|u_j(W_{n-1})^* Y|^2}{\lambda_j(W_{n-1}) - \lambda_i(W_n)} \right| \le 1.8$$

(say) with overwhelming probability for some $T_-, T_+ = \log^{O(1)} n$.



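Let $A > 100$ be a large constant to be chosen later. By Theorem 1.10 we see (if
$A$ is large enough) that

(25) $N_I = n \alpha_I |I| + O(|I| n \log^{-A/20} n)$

with overwhelming probability for any interval $I$ of length $|I| = \log^A n/n$, where
$\alpha_I := \frac{1}{|I|} \int_I \rho_{sc}(x)\, dx$. For any such interval, we see from Lemma 2.1 (and Cauchy
interlacing (5)) that with overwhelming probability

$$\sum_{j: \lambda_j(W_{n-1}) \in I} |u_j(W_{n-1})^* X|^2 = N_I + O(\log^{A/2 + O(1)} n)$$

and thus by (25) (for $A$ large enough)

$$\sum_{j: \lambda_j(W_{n-1}) \in I} |u_j(W_{n-1})^* Y|^2 = \alpha_I |I| + O(|I| \log^{-A/20} n).$$

Set $d_I := \operatorname{dist}(\lambda_i(W_n), I)/|I|$. If $d_I \ge \log n$, then

$$\frac{1}{\lambda_j(W_{n-1}) - \lambda_i(W_n)} = \frac{1}{d_I |I|} + O\left( \frac{1}{d_I^2 |I|} \right)$$

for all $j$ in the above sum, thus

(26) $\sum_{j: \lambda_j(W_{n-1}) \in I} \frac{|u_j(W_{n-1})^* Y|^2}{\lambda_j(W_{n-1}) - \lambda_i(W_n)} = \frac{\alpha_I}{d_I} + O\left( \frac{\alpha_I}{d_I^2} \right) + O\left( \frac{\log^{-A/20} n}{d_I} \right).$

We now partition the real line into intervals $I$ of length $\log^A n/n$, and sum (26)
over all $I$ with $d_I \ge \log n$. Bounding $\alpha_I$ crudely by $O(1)$, we see that
$\sum_I O(\alpha_I / d_I^2) = O(1/\log n) = o(1)$. Similarly, one has

$$\sum_I O\left( \frac{\log^{-A/20} n}{d_I} \right) = O(\log^{-A/20} n \log n) = o(1)$$

if $A$ is large enough. Finally, Riemann integration of the principal value integral

$$\mathrm{p.v.} \int_{-2}^{2} \frac{\rho_{sc}(x)}{x - \lambda_i(W_n)}\, dx := \lim_{\varepsilon \to 0} \int_{|x| \le 2:\ |x - \lambda_i(W_n)| > \varepsilon} \frac{\rho_{sc}(x)}{x - \lambda_i(W_n)}\, dx$$

shows that

$$\sum_I \frac{\alpha_I}{d_I} = \mathrm{p.v.} \int_{-2}^{2} \frac{\rho_{sc}(x)}{x - \lambda_i(W_n)}\, dx + o(1).$$

The operator norm of $W_n$ is $2 + o(1)$ with overwhelming probability (see e.g. [2],
[4]), so $|\lambda_i(W_n)| \le 2 + o(1)$. Using the formula (11) for the Stieltjes transform, one
obtains from residue calculus that

$$\mathrm{p.v.} \int_{-2}^{2} \frac{\rho_{sc}(x)}{x - \lambda_i(W_n)}\, dx = -\lambda_i(W_n)/2$$

for $|\lambda_i(W_n)| \le 2$, with the right-hand side replaced by $-\lambda_i(W_n)/2 + \sqrt{\lambda_i(W_n)^2 - 4}/2$
for $|\lambda_i(W_n)| > 2$. In either event, we have

$$\left| \mathrm{p.v.} \int_{-2}^{2} \frac{\rho_{sc}(x)}{x - \lambda_i(W_n)}\, dx \right| \le 1.$$

Putting all this together, we see that

$$\left| \sum_{I: d_I \ge \log n} \ \sum_{j: \lambda_j(W_{n-1}) \in I} \frac{|u_j(W_{n-1})^* Y|^2}{\lambda_j(W_{n-1}) - \lambda_i(W_n)} \right| \le 1 + o(1).$$

The intervals $I$ with $d_I < \log n$ will contribute at most $\log^{A + O(1)} n$ eigenvalues, by
(25) (and Cauchy interlacing (5)). The claim now follows by setting $T_-$ and
$T_+$ appropriately. The proof of Proposition 1.12 is now complete.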
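The principal value computation used in the edge case can be checked numerically. The sketch below assumes NumPy and uses a plain midpoint rule together with the standard subtraction of the singularity; the function names are hypothetical. For $\lambda < -2$ the square root is taken with the sign of $\lambda$, which is an assumption made here about the intended branch and is not spelled out in the text.

```python
import numpy as np

def rho_sc(x):
    """Semicircle density (1 / 2 pi) * sqrt(4 - x^2) on [-2, 2], zero outside."""
    return np.sqrt(np.clip(4.0 - np.asarray(x, dtype=float) ** 2, 0.0, None)) / (2.0 * np.pi)

def pv_integral(lam, num_points=400_000):
    """Midpoint-rule approximation of p.v. int_{-2}^{2} rho_sc(x) / (x - lam) dx."""
    edges = np.linspace(-2.0, 2.0, num_points + 1)
    x = 0.5 * (edges[:-1] + edges[1:])
    h = edges[1] - edges[0]
    diff = x - lam
    if abs(lam) > 2.0:
        # lam lies outside the support, so the integrand has no singularity.
        return float(np.sum(rho_sc(x) / diff) * h)
    # Subtract the singular part: the remaining integrand is bounded near lam,
    # and p.v. int_{-2}^{2} dx / (x - lam) = log((2 - lam) / (2 + lam)).
    diff[diff == 0.0] = np.inf   # drop a cell centred exactly at lam (O(h) error)
    regular = np.sum((rho_sc(x) - rho_sc(lam)) / diff) * h
    return float(regular + rho_sc(lam) * np.log((2.0 - lam) / (2.0 + lam)))

def pv_closed_form(lam):
    """Closed form: -lam/2 inside (-2, 2); outside, the decaying branch is assumed."""
    if abs(lam) <= 2.0:
        return -lam / 2.0
    return (-lam + np.sign(lam) * np.sqrt(lam * lam - 4.0)) / 2.0

if __name__ == "__main__":
    for lam in (-1.5, 0.0, 1.0, 1.9, 2.5):
        print(lam, pv_integral(lam), pv_closed_form(lam))
```

In each case the closed form has absolute value at most 1, which is the only feature of the integral that the argument uses.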



Remark 4.5. From (21) and Lemma 2.1 one sees that

$$|\lambda_{i-1}(W_{n-1}) - \lambda_i(W_n)| \ll \log^{O(1)} n/n$$

with overwhelming probability for all $n/2 \le i \le n$, and similarly one has

$$|\lambda_i(W_{n-1}) - \lambda_i(W_n)| \ll \log^{O(1)} n/n$$

with overwhelming probability for all $1 \le i \le n/2$. On the other hand, according to
the Tracy-Widom law, the gap between $\lambda_n(W_n)$ and $\lambda_{n-1}(W_n)$ (or between $\lambda_1(W_n)$
and $\lambda_2(W_n)$) can be expected to be as large as $n^{-2/3}$. Thus we see that there is
a significant bias at the edge in the interlacing law (5), which can ultimately be
traced to the imbalance of "forces" in the interlacing identity at that edge.



5. Lower bound on eigenvalue gap 



We now give the proof of Theorem 1.14. Most of the proof will follow closely
the proof of Theorem 1.6 in [27], so we shall focus on the changes needed to that
argument. As such, this section will assume substantial familiarity with the material
from [27], and will cite from it repeatedly. Similarly for the next section.

For technical reasons relating to an induction argument, it turns out that one has
to treat the extreme cases $i = 1, n$ separately:

Proposition 5.1 (Extreme cases). Theorem 1.14 is true when $i = 1$ or $i = n$.



Proof. By symmetry it suffices to do this for $i = n$. By a limiting argument we
may assume that the entries $\zeta_{ij}$ of $M_n$ are continuously distributed. From Lemma
4.4 one has (almost surely) that

$$\sum_{j=1}^{n-1} \frac{|u_j(W_{n-1})^* Y|^2}{\lambda_j(W_{n-1}) - \lambda_n(W_n)} = \frac{1}{\sqrt{n}} \zeta_{nn} - \lambda_n(W_n).$$

Recall that $\lambda_n(W_n) = 2 + o(1)$ with overwhelming probability; also, $\frac{1}{\sqrt{n}} \zeta_{nn} = o(1)$
with overwhelming probability. As all the terms in the left-hand side have the same
sign, we conclude that

$$\frac{|u_{n-1}(W_{n-1})^* Y|^2}{|\lambda_{n-1}(W_{n-1}) - \lambda_n(W_n)|} \ll 1.$$

From Theorem 2.3 and Proposition 1.12 we have $|u_{n-1}(W_{n-1})^* X| \ge \dots$ (say)
with high probability, and so

$$|\lambda_{n-1}(W_{n-1}) - \lambda_n(W_n)| \ge \dots$$

with high probability. The claim now follows from the Cauchy interlacing property
(5). □

Remark 5.2. In fact, at the edge, one should be able to improve the lower bound
on the eigenvalue gap substantially, from $n^{-c_0}$ to $n^{1/3 - c_0}$, in accordance with the
Tracy-Widom law, but we will not need to do so here.



Now we handle the general case of Theorem 1.14. Fix $M_n$ and $c_0$. We write $i_0, n_0$
for $i, n$, thus $1 \le i_0 \le n_0$ and our task is to show that

$$\dots$$

with high probability. By Proposition 5.1 we may assume $1 < i_0 < n_0$. We may also
assume $n_0$ to be large, as the claim is vacuous otherwise. As in previous sections,
we may truncate so that all coefficients $\zeta_{ij}$ are of size $O(\log^{O(1)} n_0)$ (as before, the
exponentially small corrections to the mean and variance of $\zeta_{ij}$ caused by this are
easily controlled), and approximate so that the distribution is continuous rather
than discrete.

For each $n_0/2 \le n \le n_0$, let $A_n$ be the top left $n \times n$ minor of $A_{n_0}$. As in [27,
Section 3.4], we introduce the regularized gap

(27) $g_{i,l,n} := \inf_{1 \le i_- \le i - l < i \le i_+ \le n} \frac{\lambda_{i_+}(A_n) - \lambda_{i_-}(A_n)}{\min(i_+ - i_-, \log^{C_1} n_0)^{\log^{0.9} n_0}},$

for all $n_0/2 \le n \le n_0$ and $1 \le i - l < i \le n$, where $C_1$ is a large constant to be
chosen later. It will suffice to show for each $1 < i_0 < n_0$ that

$$\dots$$

with high probability. By symmetry we may assume that $n_0/2 \le i_0 < n_0$.
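To make the regularized gap concrete, here is a brute-force sketch of its evaluation from a sorted list of eigenvalues. It assumes the definition reads $g_{i,l,n} = \inf_{1 \le i_- \le i-l < i \le i_+ \le n} (\lambda_{i_+} - \lambda_{i_-}) / \min(i_+ - i_-, \log^{C_1} n_0)^{\log^{0.9} n_0}$; the function name and the toy inputs are hypothetical and chosen only for illustration.

```python
import math

def regularized_gap(eigs, i, l, n0, C1):
    """Brute-force regularized gap g_{i,l,n} for eigenvalues sorted increasingly.

    eigs : list of the n eigenvalues, in increasing order
    i, l : 1-based indices with 1 <= i - l < i <= n
    n0   : the reference size entering the logarithmic factors
    C1   : the constant in the cap log^{C1} n0 (an assumption of this sketch)
    """
    n = len(eigs)
    cap = math.log(n0) ** C1          # log^{C1} n0
    power = math.log(n0) ** 0.9       # log^{0.9} n0
    best = math.inf
    for i_minus in range(1, i - l + 1):       # 1 <= i_- <= i - l
        for i_plus in range(i, n + 1):        # i <= i_+ <= n
            width = min(i_plus - i_minus, cap)
            gap = eigs[i_plus - 1] - eigs[i_minus - 1]
            best = min(best, gap / width ** power)
    return best

if __name__ == "__main__":
    # Toy spectrum: the raw gap between the 2nd and 3rd eigenvalues is 0.5.
    print(regularized_gap([0.0, 1.0, 1.5, 4.0], i=3, l=1, n0=4, C1=1.0))
```

With $l = 1$ the choice $i_- = i - 1$, $i_+ = i$ has unit width, so the regularized gap never exceeds the raw gap $\lambda_i - \lambda_{i-1}$; wider windows are heavily discounted by the large exponent.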

As before, we let $u_1(A_n), \dots, u_n(A_n)$ be an orthonormal eigenbasis of $A_n$ associated
to the eigenvalues $\lambda_1(A_n), \dots, \lambda_n(A_n)$. We also let $X_n \in \mathbb{C}^n$ be the rightmost
column of $A_{n+1}$ with the bottom coordinate $\sqrt{n} \zeta_{n+1,n+1}$ removed.

We will need two key lemmas. First, we have the following deterministic lemma
(a minor variant of [27, Lemma 47]), showing that a narrow gap can be propagated
backwards in $n$ unless one of a small number of exceptional events happens:

Lemma 5.3 (Backwards propagation of gap). Suppose that $i_0 \le n + 1 \le n_0$ and
$l \le n/10$ is such that

(28) $g_{i_0, l, n+1} \le \delta$

for some $0 < \delta \le 1$ (which can depend on $n$), and that

(29) $\lambda_{n+1}(A_{n+1}) - \lambda_n(A_{n+1}) \ge \delta \exp(\log^{0.91} n_0).$

Then $i_0 \le n$. Suppose further that

(30) $\dots$

for some $m \ge 0$ with

(31) $2^m \le \delta^{-1/2}.$

Then one of the following statements hold:

(i) (Macroscopic spectral concentration) There exists $1 \le i_- < i_+ \le n+1$ with
$i_+ - i_- \ge \log^{C_1/2} n$ such that $|\lambda_{i_+}(A_{n+1}) - \lambda_{i_-}(A_{n+1})| \le \delta^{1/4} \exp(\log^{0.95} n)(i_+ - i_-)$.

(ii) (Small inner products) There exists $1 \le i_- \le i_0 - l < i_0 \le i_+ \le n$ with
$i_+ - i_- \le \log^{C_1/2} n$ such that

$$\dots$$

(iii) (Large coefficient) We have

$$|\zeta_{n+1,n+1}| \ge n^{0.4}.$$

(iv) (Large eigenvalue) For some $1 \le i \le n + 1$ one has

$$|\lambda_i(A_{n+1})| \ge \frac{n \exp(-\log^{0.95} n)}{\delta^{1/2}}.$$

(v) (Large inner product) There exists $1 \le i \le n$ such that

$$|u_i(A_n)^* X_n| \ge \dots$$

(vi) (Large row) We have

$$\|X_n\|^2 \ge \frac{n^2 \exp(-\log^{0.96} n)}{\delta^{1/2}}.$$

(vii) (Large inner product near $i_0$) There exists $1 \le i \le n$ with $|i - i_0| \le \log^{C_1} n$
such that

$$|u_i(A_n)^* X_n|^2 \ge 2^{m/2} n \log^{0.8} n.$$



Proof. The proof of this lemma repeats the proof of [27, Lemma 47] in [27,
Section 6] almost exactly. Only the following changes have to be made:

• We have the upper bound $\lambda_{i_+}(A_{n+1}) - \lambda_{i_-}(A_{n+1}) \le \delta (\log^{C_1} n_0)^{\log^{0.9} n_0}$,
which together with (29) forces $i_+ \ne n + 1$ and thus $i_0 \le n$ as required.

• The variable $j$ now lies in the range $1 \le j \le n$ rather than $\varepsilon n/10 \le j \le (1 - \varepsilon/10) n$.

• $i_{--}$ has to be defined as $\max(i_- - 2^{\dots}, 1)$ rather than just $i_- - 2^{\dots}$ (and
similarly for $i_{++}$).

□

Next, we need the following result that asserts that the events (i)-(vii) are rare:

Proposition 5.4 (Bad events are rare). Suppose that $n_0/2 \le n < n_0$ and $l \le n/10$,
and set $\delta := n_0^{-\kappa}$ for some sufficiently small fixed $\kappa > 0$. Let $m \ge 0$ be such that
$2^m \le \delta^{-1/2}$. Then:

(a) The events (i), (iii), (iv), (v), (vi) in Lemma 5.3 all fail with high probability.

(b) There is a constant $C'$ such that all the coefficients of the eigenvectors
$u_j(A_n)$ for $1 \le j \le n$ are of magnitude at most $n^{-1/2} \log^{C'} n$ with overwhelming
probability. Conditioning $A_n$ to be a matrix with this property,
the events (ii) and (vii) occur with a conditional probability of at most
$2^{-\kappa m} + n^{-\kappa}$.

(c) Furthermore, there is a constant $C_2$ (depending on $C', \kappa, C_1$) such that if
$l \ge C_2$ and $A_n$ is conditioned as in (b), then (ii) and (vii) in fact occur
with a conditional probability of at most $2^{-\kappa m} \log^{-2C_1} n + n^{-\kappa}$.

Proof. The proof of this proposition repeats the proof of [27, Proposition 49] in [27,
Section 7] almost exactly. Only the following changes have to be made:

• All references to [27, Theorem 56] (i.e. Theorem 1.7) need to be replaced
with Theorem 1.10.

• The variable $j$ now lies in the range $1 \le j \le n$ rather than $\varepsilon n/2 \le j \le (1 - \varepsilon/2) n$.

□

Given Lemma 5.3 and Proposition 5.4, the proof of Theorem 1.14 exactly follows
the proof of Theorem 1.6 in [27, Section 3.5], with the following minor changes:

• In the definition of the event $E_n$, the range $\varepsilon n/2 \le j \le (1 - \varepsilon/2) n$ needs to
be expanded to $1 \le j \le n$.

• In the definition of the event $E_0$, the events that (29) fail for some
$n_0 - \log^{2C_1} n_0 \le n \le n_0$ have to be included; but these events occur with
polynomially small probability, thanks to Proposition 5.1 and the union bound.

This concludes the proof of Theorem 1.14.

6. The four moment theorem



We now prove Theorem 1.13. As in [27, Section 3.3], the proof is based on two key
propositions. The first proposition asserts that one can swap a single coefficient (or
more precisely, two coefficients) of a (deterministic) matrix $A$ as long as $A$ obeys a
certain "good configuration condition":

Proposition 6.1 (Replacement given a good configuration). There exists a positive
constant $C_1$ such that the following holds. Let $k \ge 1$ and $\varepsilon_1 > 0$, and assume $n$
sufficiently large depending on these parameters. Let $1 \le i_1 < \dots < i_k \le n$. For
a complex parameter $z$, let $A(z)$ be a (deterministic) family of $n \times n$ Hermitian
matrices of the form

$$A(z) = A(0) + z e_p e_q^* + \bar{z} e_q e_p^*$$

where $e_p, e_q$ are unit vectors. We assume that for every $1 \le j \le k$ and every
$|z| \le n^{1/2 + \varepsilon_1}$ whose real and imaginary parts are multiples of $n^{-C_1}$, we have

• (Eigenvalue separation) For any $1 \le i \le n$ with $|i - i_j| \ge n^{\varepsilon_1}$, we have

(33) $|\lambda_i(A(z)) - \lambda_{i_j}(A(z))| \ge n^{-\varepsilon_1} |i - i_j|.$

• (Delocalization at $i_j$) If $P_{i_j}(A(z))$ is the orthogonal projection to the eigenspace
associated to $\lambda_{i_j}(A(z))$, then

(34) $\|P_{i_j}(A(z)) e_p\|, \|P_{i_j}(A(z)) e_q\| \le n^{-1/2 + \varepsilon_1}.$

• For every $\alpha \ge 0$

(35) $\|P_{i_j, \alpha}(A(z)) e_p\|, \|P_{i_j, \alpha}(A(z)) e_q\| \le 2^{\alpha/2} n^{-1/2 + \varepsilon_1},$

whenever $P_{i_j, \alpha}$ is the orthogonal projection to the eigenspaces corresponding
to eigenvalues $\lambda_i(A(z))$ with $2^\alpha \le |i - i_j| < 2^{\alpha + 1}$.

We say that $A(0), e_p, e_q$ are a good configuration for $i_1, \dots, i_k$ if the above properties
hold. Assuming this good configuration, then we have

(36) $\mathbf{E}(F(\zeta)) = \mathbf{E} F(\zeta') + O(n^{-(r+1)/2 + O(\varepsilon_1)}),$

whenever

$$F(z) := G(\lambda_{i_1}(A(z)), \dots, \lambda_{i_k}(A(z)), Q_{i_1}(A(z)), \dots, Q_{i_k}(A(z))),$$

and

$$G = G(\lambda_{i_1}, \dots, \lambda_{i_k}, Q_{i_1}, \dots, Q_{i_k})$$

is a smooth function from $\mathbb{R}^k \times \mathbb{R}_+^k \to \mathbb{R}$ that is supported on the region

$$\dots$$

and obeys the derivative bounds

$$\dots$$

for all $0 \le j \le 5$, and $\zeta, \zeta'$ are random variables with $|\zeta|, |\zeta'| \le n^{1/2 + \varepsilon_1}$ almost
surely, which match to order $r$ for some $r = 2, 3, 4$.

If $G$ obeys the improved derivative bounds

$$\dots$$

for $0 \le j \le 5$ and some sufficiently large absolute constant $C$, then we can
strengthen $n^{-(r+1)/2 + O(\varepsilon_1)}$ in (36) to $n^{-(r+1)/2 - \varepsilon_1}$.

Proof. See [27, Proposition 43]. □

The second proposition asserts that these good configurations occur very frequently:

Proposition 6.2 (Good configurations occur very frequently). Let $\varepsilon_1 > 0$ and
$C, C_1, k \ge 1$. Let $1 \le i_1 < \dots < i_k \le n$, let $1 \le p, q \le n$, let $e_1, \dots, e_n$ be the
standard basis of $\mathbb{C}^n$, and let $A(0) = (\zeta_{ij})_{1 \le i,j \le n}$ be a random Hermitian matrix
with independent upper-triangular entries and $|\zeta_{ij}| \le n^{1/2} \log^C n$ for all $1 \le i, j \le n$,
with $\zeta_{pq} = \zeta_{qp} = 0$, but with $\zeta_{ij}$ having mean zero and variance 1 for all other $ij$,
except on the diagonal where the variance is instead $c$ for some absolute constant $c > 0$,
and also being distributed continuously in the complex plane. Then $A(0), e_p, e_q$
obey the Good Configuration Condition in Proposition 6.1 for $i_1, \dots, i_k$ and with the
indicated value of $\varepsilon_1, C_1$ with overwhelming probability.

Proof. The proof of this proposition repeats the proof of [27, Proposition 44] in [27,
Section 5] almost exactly. Only the following changes have to be made:

• All references to [27, Theorem 56] (i.e. Theorem 1.7) need to be replaced
with Theorem 1.10.

• All references to [27, Proposition 58] (i.e. Proposition 1.8) need to be
replaced with Proposition 1.12.

• The edge regions in which $\lambda_i(A(z))$ do not fall inside the bulk region
$[(-2 + \varepsilon') n, (2 - \varepsilon') n]$ no longer need to be treated separately, thus simplifying the
last paragraph of the proof somewhat.

□

Given these two propositions, the proof of Theorem 1.13 repeats the proof of [27,
Theorem 15] in [27, Section 3.3] almost exactly. Only the following changes have
to be made:

• All references to [27, Proposition 44] need to be replaced with Proposition 6.2.

The proof of Theorem 1.13 is now complete.